SYSTEM AND METHOD FOR HIGH-SPEED COMMUNICATIONS
BETWEEN AN APPLICATION PROCESSOR AND COPROCESSOR 



Technical Field 

Embodiments of the present invention pertain to communications between 
processors and coprocessors. 

Background 

Many processing systems utilize coprocessors and/or companion devices 
to offload some processing-intensive operations. For example, in
graphics processing operations, a graphics accelerator or graphics coprocessor 
may be used to perform graphics-intensive processing operations on behalf of an 
application processor. In conventional systems, the application processor 
communicates graphics command and/or control data as well as display data with 
the coprocessor over a system bus to allow the coprocessor to generate display 
data for a graphics display. In wireless communication devices, coprocessors 
and/or companion devices may be used to perform specific wireless related 
operations on behalf of the application processor. 

One problem with such conventional systems is that the bandwidth of the 
system bus may limit the ability of an application processor to utilize the full 
capability of the coprocessor or companion device. Thus, there is a general need
for systems and methods that provide higher-bandwidth communication between
processors and coprocessors.

Brief Description of the Drawings 

The appended claims are directed to some of the various embodiments of 
the present invention. However, the detailed description presents a more complete 
understanding of embodiments of the present invention when considered in 
connection with the figures, wherein like reference numbers refer to similar items 
throughout the figures and: 

FIG. 1 is a block diagram of a communication device in accordance with 
embodiments of the present invention; 

FIG. 2 is a block diagram of a processing system in accordance with
embodiments of the present invention; and 

FIG. 3 is a flow chart of a procedure for communicating between a 
processor and a coprocessor in accordance with embodiments of the present 
invention.

Detailed Description

The following description and the drawings illustrate specific
embodiments of the invention sufficiently to enable those skilled in the art to
practice them. Other embodiments may incorporate structural, logical, electrical,
process, and other changes. Examples merely typify possible variations. Individual 
components and functions are optional unless explicitly required, and the 
sequence of operations may vary. Portions and features of some embodiments may 
be included in or substituted for those of others. The scope of embodiments of the
invention encompasses the full ambit of the claims and all available equivalents of
those claims. 

FIG. 1 is a block diagram of a communication device in accordance with 
embodiments of the present invention. Communication device 100 may receive 
and/or transmit radio frequency (RF) communications with antenna 102. RF
signals received from antenna 102 may be down-converted to data signals by RF
conversion circuitry 104. Data signals may also be up-converted by RF conversion 
circuitry 104 for transmission by antenna 102. Processing system 106 may 
communicate data signals with other circuitry (not illustrated), and may further 
communicate input/output (I/O) data with I/O device 108. 

Although some embodiments of the present invention that apply to
wireless communications and wireless communication devices are described
herein, the scope of the invention is not limited in this respect. Embodiments of 
the present invention apply equally to wireline communication devices and 
processing systems. Although some embodiments of the present invention that
apply to graphics processing and the generation of graphics data in systems using
a graphics coprocessor or graphics companion device are described herein, the 
scope of the invention is not limited in this respect. Embodiments of the present 
invention apply equally to processing systems that utilize almost any type of
coprocessor and/or companion device.

In some embodiments, I/O device 108 may comprise one or more of 
almost any input device, output device or I/O device. Examples of devices suitable 
for use as device 108 include almost any type of information or graphics display
including liquid crystal displays (LCDs), cathode ray tube (CRT) type displays, 
OLED, PLED, and MEMS displays, electrophoretic displays, electroluminescent 
displays, liquid crystal on silicon displays, grating displays, interferometric 
displays, field emissive device displays, etc., although the scope of the invention
is not limited in this respect. Other examples of devices suitable for use as device
108 include almost any type of I/O device including, for example, disk drives, 
smart card readers, retinal scanners, etc., although the scope of the invention is not 
limited in this respect. 

In some embodiments, processing system 106 includes an application
processor and a companion device or coprocessor. In some embodiments, the
companion device or coprocessor may be a graphics coprocessor which generates
image data for displaying on I/O device 108, although the scope of the invention is 
not limited in this respect. In some embodiments, the application processor and 
companion device may communicate over a dedicated high-speed datapath. The
use of the companion device may provide greater system performance while
reducing power consumption. Furthermore, the use of the high-speed datapath 
may allow improved utilization of the companion device's ability and may also 
provide additional functionality. Some example embodiments of processing 
system 106 are described in more detail below. 

Communication device 100 may be a personal digital assistant (PDA), a
laptop or portable computer with or without wireless communication capability, a
web tablet, a wireless telephone, a wireless headset, a pager, an instant messaging 
device, a digital camera, or any device that may receive and/or transmit 
information wirelessly. In some embodiments, RF conversion circuitry 104 may
transmit and/or receive RF communications in accordance with specific
communication standards, such as the IEEE 802.11(a), 802.11(b) and/or 802.11(g)
standards for wireless local area networks, although circuitry 104 may
also be suitable to transmit and/or receive communications in accordance with 
other techniques including the Digital Video Broadcasting Terrestrial (DVB-T)
broadcasting standard, and the High Performance Radio Local Area Network
(HiperLAN) standard. Antenna 102 may comprise a directional or
omni-directional antenna, including, for example, a dipole antenna, a monopole antenna,
a loop antenna, a microstrip antenna or other type of antenna suitable for reception
and/or transmission of RF signals which may be processed by RF conversion 
circuitry 104. 

Although communication device 100 is illustrated as a wireless 
communication device, device 100 may be almost any wireless or wireline
communication device, including a general purpose processing or computing
system. In some embodiments, device 100 may be a battery-powered device, 
although the scope of the invention is not limited in this respect. In some 
embodiments, device 100 may not require antenna 102 and may not require RF 
conversion circuitry 104. 

FIG. 2 is a block diagram of a processing system in accordance with
embodiments of the present invention. In some embodiments, processing system
200 may be suitable for use as processing system 106 (FIG. 1), although other 
processing systems are also suitable. Processing system 200 comprises application 
processor 202 and coprocessor 204 which generally communicate over system bus
206. In some embodiments, coprocessor 204 may generate output data 207 for a
device, such as I/O device 108 (FIG. 1). System memory 208 may be accessed by 
both application processor 202 and coprocessor 204. Coprocessor 204 may have 
local memory 210 for dedicated use by coprocessor 204. In some embodiments, 
coprocessor 204 may be a graphics accelerator and may generate output data 207,
which may be image data, for a graphics display based on graphics command
and/or control data and image data. 

In accordance with some embodiments of the present invention, 
application processor 202 includes interface 220, and coprocessor 204 includes 
interface 222. In some graphics embodiments, interfaces 220 and 222 may be
graphics interfaces and may be configured to receive pixel-stream formatted data.
In some embodiments, formatted graphics command and/or control data and/or 
formatted image data may be received at interface 222 from interface 220 over 
high-speed datapath 214. In some embodiments, coprocessor 204 may have 

Attorney Docket No. 884.897US1 4 Client Ref. No. PI 5541 



display interface 224 to provide display data 207 to an I/O device, which may be a 
graphics display. High-speed datapath 214 may communicate any data, including 
command and/or control data between interface 220 and interface 222. In some 
graphics embodiments, the data may be formatted as a pixel stream which may use 
the organization and timing of a pixel stream as transmitted to an information
display device, although the scope of the invention is not limited in this respect. 

In some embodiments, application processor 202 may drive a first display 
and coprocessor 204 may drive a second display. Examples of these embodiments 
include systems utilizing more than one display such as a "clamshell" wireless 
telephone, or a PDA or laptop computer coupled to a projection device. In these
embodiments, processor 202 through interface 220 and datapath 214 may drive 
both the first display and interface 222, while coprocessor 204 and interface 224 
may drive the second display based on command and/or control data received over 
datapath 214.

In some graphics embodiments, graphics interface 220 may comprise
drivers 226 to receive graphics command and/or control data from processing core
228 of application processor 202 and to format the graphics command and/or 
control data into the pixel-stream formatted graphics command and/or control 
data, although the scope of the invention is not limited in this respect. In some
embodiments, graphics interface 222 may comprise drivers 230 to reformat (or
unbundle) the pixel-stream formatted graphics command and/or control data back 
to the graphics command and/or control data, although the scope of the invention 
is not limited in this respect. In some embodiments, drivers 226 and 230 may 
comprise hardware and/or software components. In some graphics embodiments,
interface 220 may be an LCD controller output of an application processor
suitable for directly interfacing with an LCD. 
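
As an illustrative sketch only, the formatting performed by drivers 226 and the reformatting (unbundling) performed by drivers 230 might resemble the following Python fragment. The 16-bit pixel width, the two-byte length prefix, and the names pack_as_pixels and unpack_from_pixels are assumptions made for illustration; the description does not require any particular pixel format.

```python
import struct
from typing import List

PIXEL_BYTES = 2  # assumed 16-bit pixels; the description fixes no pixel width


def pack_as_pixels(command: bytes) -> List[int]:
    """Format command bytes as 16-bit pixel values (hypothetical layout)."""
    if len(command) > 0xFFFF:
        raise ValueError("command too long for the assumed 2-byte length prefix")
    # Prefix the payload with its length so the receiver can strip padding.
    framed = struct.pack(">H", len(command)) + command
    # Pad with a zero byte so the stream is a whole number of pixels.
    if len(framed) % PIXEL_BYTES:
        framed += b"\x00"
    return [value for (value,) in struct.iter_unpack(">H", framed)]


def unpack_from_pixels(pixels: List[int]) -> bytes:
    """Recover the original command bytes from the pixel values."""
    raw = b"".join(struct.pack(">H", pixel) for pixel in pixels)
    (length,) = struct.unpack_from(">H", raw, 0)
    return raw[2:2 + length]
```

In this sketch the receiving side needs no knowledge of the command contents; it only strips the length prefix and padding before handing the bytes to the coprocessor core.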

Coprocessor 204 may further comprise coprocessor processing core 232, 
which among other things, may respond to the command and/or control data and 
other data received from application processor 202 to generate the data for an I/O
device such as a graphics display. In some graphics embodiments, coprocessor
processing core 232 may include a graphics accelerator to offload at least some 
graphics-processing operations from the application processor. The graphics- 
processing operations may include two-dimensional (2D) graphics operations, 
three-dimensional (3D) graphics operations, multimedia encoding and decoding
operations, and display refresh operations, although the scope of the invention is 
not limited in this respect. The graphics-processing operations may be indicated 
by graphics command and/or control data. In some embodiments, the graphics 
command and/or control data may comprise commands and controls to instruct
the coprocessor to perform graphics-processing operations. 

In some embodiments, application processor 202 may include on-die 
memory 234, which may, for example, comprise SRAM or Flash memory,
although the scope of the invention is not limited in this respect. Application
processor 202 may perform a DMA transfer of data using memory controller 219
from memory 208 and/or on-die memory 234 to coprocessor 204 over high-speed 
datapath 214, although the scope of the invention is not limited in this respect. In 
some embodiments, application processor 202 may selectively refrain from 
transferring data to coprocessor 204 over the system bus 206 to improve
performance.

In some embodiments, coprocessor 204 may be an integrated part of a 
graphics display, although the scope of the invention is not limited in this respect. 
In these embodiments, the graphics display may include photodiodes, which may 
allow the display to operate as a scanner to generate image data. Coprocessor 204
may convert the image data to pixel-stream formatted image data to transfer over
the high-speed datapath 214 to application processor 202 for further use and/or 
processing. 

In some embodiments, the display data generated by coprocessor 204 for 
an I/O device may comprise raw pixel data describing each pixel of the graphics 
display in a per-pixel format. The pixel-stream formatted image data may
comprise pixel data in a pixel format, and the pixel-stream formatted command
data may comprise command data in pixel format. 

In some embodiments, coprocessor 204 may comprise a graphics 
accelerator, a hardware accelerator, or a companion device. In some embodiments, 
interface 220 may be an LCD controller interface or a graphics output interface, and
interface 222 may be a graphics port or a graphics input interface. 

Some communications between application processor 202 and coprocessor 
204 may take place over system bus 206 and may utilize system bus cycles in a 
conventional manner. Although system 200 is illustrated as having system bus
206, embodiments of the present invention apply to the use of any communication 
link or interconnect structure for communications between the various elements.

Coprocessor 204 may have associated decoders and drivers as part of
system memory interface 216 for interfacing with system bus 206 and as part of
local memory interface 218 for interfacing with local memory bus 212. In some 
embodiments, coprocessor 204, memory interfaces 216, 218 and local bus 212, as 
well as other elements not illustrated, may be located on a separate chip. In some 
embodiments, local memory 210 may be an off-chip memory or memory
structure. In some embodiments, coprocessor 204 may include on-die memory
which may be utilized in addition to an off-die memory, such as memory 210. 

In some embodiments, datapath 214 may be a high-speed serial datapath. 
In some embodiments, datapath 214 may comprise a pair of conductors suitable to 
carry high-speed digital differential signals, although the scope of the invention is
not limited in this respect. In other embodiments, datapath 214 may be a parallel
datapath or bus. Datapath 214 may be supported by a logic interface, a current- 
mode interface or a fiber-optic interface, although the scope of the invention is not 
limited in these respects. 

In some embodiments, a coprocessor developer may have defined the
syntax for the data stream communications for interface 222. In some
embodiments, the developer may have defined a specific protocol for such
communications, which may be a simple or a complicated transaction sequence. In 
some embodiments, a software driver, which may be one of drivers 226, may 
package data into a format in accordance with the syntax requirements of
coprocessor 204, and the driver may copy this formatted data stream into a buffer,
such as an LCD frame buffer in the case of LCD devices. The formatted data may
be directly transmitted from the buffer to coprocessor 204 over datapath 214. 
Coprocessor 204 may decode the bit-stream based on the predetermined syntax. 
Because the syntax may depend on the particular coprocessor, embodiments of the
present invention do not require any particular syntax describing the
communications over datapath 214. For example, in some embodiments, a
communications protocol stack may be utilized to package data for transmission 
over a serial link (not illustrated). Instead of using the serial link, the data stream 
may be directly transmitted via datapath 214 by copying the data stream into the
LCD frame buffer rather than sending the data stream serially over the serial link. 
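
A minimal sketch of this arrangement follows. The message syntax shown (a two-byte marker, a one-byte opcode, a two-byte payload length, and a one-byte additive checksum) is purely hypothetical, as are the names encode_message and decode_message and the bytearray standing in for an LCD frame buffer; an actual syntax would be whatever the coprocessor developer defined.

```python
import struct

MARKER = 0xC0DE                 # hypothetical start-of-message marker
HEADER = struct.Struct(">HBH")  # marker, opcode, payload length (5 bytes)


def encode_message(opcode: int, payload: bytes) -> bytes:
    """Package an opcode and payload according to the assumed syntax."""
    body = HEADER.pack(MARKER, opcode, len(payload)) + payload
    checksum = sum(body) & 0xFF  # simple additive checksum over header + payload
    return body + bytes([checksum])


def decode_message(buffer) -> tuple:
    """Decode one message from the start of the buffer; return (opcode, payload)."""
    marker, opcode, length = HEADER.unpack_from(buffer, 0)
    if marker != MARKER:
        raise ValueError("start marker not found")
    end = HEADER.size + length
    if sum(buffer[:end]) & 0xFF != buffer[end]:
        raise ValueError("checksum mismatch")
    return opcode, bytes(buffer[HEADER.size:end])


# The driver copies the formatted stream into a buffer standing in for the
# LCD frame buffer; the coprocessor side decodes what arrives over the datapath.
frame_buffer = bytearray(64)
message = encode_message(0x10, b"\x00\x20\x00\x40")
frame_buffer[:len(message)] = message
opcode, payload = decode_message(frame_buffer)
```

Because the decoder relies only on the agreed syntax, the same buffer-copy mechanism could carry a different syntax for a different coprocessor.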

Examples of applications which may run on application processor 202 may 
depend on the primary purpose of system 200. For example, when system 200 is 
part of a personal computer or processing system, applications may include any
software program running thereon. When system 200 is part of a wireless 
communication device (e.g., PDA, wireless telephone, web tablet), applications 
may include software and programs that relate to wireless communications. When 
system 200 serves as a microcontroller, applications may include dedicated
control-type applications.

In some embodiments, coprocessor 204 may be used to perform tasks that 
could be performed by the application processor, such as repetitive tasks that may 
require system memory access. Examples of such tasks include display refresh 
(e.g., for a graphics chip), and other graphics-intensive operations which may
require access to local memory 210 or may have bus mastering capabilities to use
system memory 208. This offloading of tasks from the application processor to the 
coprocessor may reduce power consumption because the application processor 
may, for example, be turned off or not used during these operations. For example, 
the application processor may not be needed for display refresh operations. This
offloading may also free up processing cycles of the application processor and free
up system bandwidth allowing for faster and more efficient processing by the 
application processor. In the case of a wireless device or chip, the companion 
device may maintain wireless network connectivity while the application 
processor sleeps, and may wake up the application processor, for example, when
information is received over the network. Portions of local memory 210 may be
accessed by coprocessor 204 for these offloaded operations, although in some 
embodiments, the companion device may also utilize portions of system memory 
208 over datapath 214 (rather than over system bus 206) when additional memory 
is required, although the scope of the invention is not limited in this respect. 

In some embodiments, high-speed datapath 214 may relieve system bus
206 of display-refresh traffic, which may comprise data rates of up to 250
Mbytes/sec or even greater. In some embodiments, the display refresh function
may be performed almost entirely by coprocessor 204 and iterations of a displayed 
image may be passed over datapath 214 to further relieve use of system bus 206.
Accordingly, system bus 206 may no longer limit the graphics capability of a 
device, and substantially more image data and more processed image data may be 
presented to a display through use of datapath 214. 
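
As a rough worked example, the display-refresh traffic removed from the system bus can be estimated as width x height x bytes per pixel x refresh rate. The display parameters below are assumptions chosen for illustration and are not taken from this description.

```python
def refresh_bandwidth(width: int, height: int, bytes_per_pixel: int, refresh_hz: int) -> int:
    """Return the display-refresh data rate in bytes per second."""
    return width * height * bytes_per_pixel * refresh_hz


# Assumed examples: a small handheld panel and a larger desktop-class panel.
qvga_rate = refresh_bandwidth(320, 240, 2, 60)   # 9,216,000 bytes/s
xga_rate = refresh_bandwidth(1024, 768, 4, 60)   # 188,743,680 bytes/s
```

Under these assumptions the larger panel alone accounts for roughly 189 Mbytes/sec of refresh traffic, which is of the same order as the figure mentioned above.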

In some embodiments, coprocessor 204 and associated local memory 210
may be part of a separate semiconductor device/chip or card from other system
elements, although the scope of the invention is not limited in this respect. In 
some embodiments, dynamic random access memory (DRAM), SDRAM or Flash, 
as well as other types of memory, and combinations thereof may be suitable for
use for system memory 208 and local memory 210.

In some embodiments, graphics primitives may be block transferred from 
application processor 202 to coprocessor 204 over datapath 214. In some 
embodiments, graphics primitives may be block transferred by first storing a block 
of graphics primitives in a memory array local to application processor 202. This
block of data may then be transferred to the coprocessor via a software memory
copy operation or a hardware DMA operation, although the scope of the invention 
is not limited in this respect. If necessary, the data may be formatted and 
organized to fit the form of a pixel-data stream before being transferred across the 
interface. Although multimedia data may include many different content data
types, it may be formatted and organized in the same or similar manner.

In some wireless embodiments, coprocessor 204 may be a wireless 
companion chip to perform wireless-specific tasks, such as maintaining network 
connectivity or generating encoded voice or data for wireless communications. In 
these wireless embodiments, application processor 202 may communicate wireless
data with coprocessor 204 in a data-stream format over high-speed datapath 214.
In some of these wireless embodiments, interface 224 may provide an interface to 
RF circuitry, such as RF circuitry 104 (FIG. 1) including associated off-chip 
components, such as a low-noise amplifier (LNA), power amplifier (PA), RF 
switches, filters, and/or an antenna. The functions of these elements may not
necessarily be integrated into coprocessor 204. In these embodiments, core 232
may perform integrated wireless transceiver functions. In these embodiments, the 
data communicated over datapath 214 may include digitally encoded data or voice 
communication signals. In these embodiments, transceiver functions may be 
allocated between application processor 202 and coprocessor 204 and may be
allocated in a manner similar to the graphics functions previously described. In 
these embodiments, functions provided by separate functional blocks controlled 
by a single processor including some of the modulation and demodulation 
functions may be performed in software and/or hardware within application
processor 202, instead of all functions being performed in the transceiver's 
coprocessing functions. In other words, functions that were performed using a 
single processor may be divided into separate function processing subsystems 
each with their own coprocessor. 

Although system 200 is illustrated as having several separate functional
elements, one or more of the functional elements may be combined and may be
implemented by combinations of software-configured elements, such as 
processing elements including digital signal processors (DSPs), and/or other 
hardware elements. For example, processing elements may comprise one or more
microprocessors, DSPs, application specific integrated circuits (ASICs), and
combinations of various hardware and logic circuitry for performing at least the 
functions described herein. 

Unless specifically stated otherwise, terms such as processing, computing, 
calculating, determining, displaying, or the like, may refer to an action and/or
process of one or more processing or computing systems or similar devices that
may manipulate and transform data represented as physical (e.g., electronic) 
quantities within a processing system's registers and memory into other data 
similarly represented as physical quantities within the processing system's 
registers or memories, or other such information storage, transmission or display
devices. Furthermore, as used herein, computing or processing device or system
includes one or more processing elements coupled with computer readable 
memory that may be volatile or non-volatile memory or a combination thereof.

FIG. 3 is a flow chart of a procedure for communicating between a 
processor and a coprocessor in accordance with embodiments of the present
invention. Procedure 300 may be performed by a system having an application
processor and coprocessor coupled by a high-speed datapath, such as system 200 
(FIG. 2), although other systems may be suitable for performing procedure 300. 
Although procedure 300 is described for some graphics embodiments, the scope
of the present invention is not limited in this respect; procedure 300 is applicable
to other embodiments. 

Operation 302 generates command and/or control data with the application 
processor. The command and/or control data may be graphics command and/or 
control data to be utilized by a graphics coprocessor. In operation 304, the
application processor may determine whether to send the command and/or control
data to the coprocessor over a system bus, such as bus 206 (FIG. 2), or over a 
high-speed link, such as datapath 214 (FIG. 2). As part of operation 304, the 
application processor may refrain from sending graphics command and/or control 
data and/or image data over the system bus to free up the system bus for other
system operations. When the application processor decides to utilize a high-speed
datapath instead of the system bus, at least operations 306 through 314 may be 
performed. 

Operation 306 formats the command and/or control data into a format for
transfer over a high-speed datapath to a coprocessor. In some embodiments, the
command and/or control data may be formatted into a pixel-stream format (e.g., to 
look like raw pixel data). In operation 308, the formatted command and/or control 
data may be buffered in a buffer memory and in operation 310, the formatted 
command and/or control data may be transferred from the buffer over the
high-speed datapath to the coprocessor.

In operation 312, the formatted command and/or control data and/or 
image data may be received by the coprocessor over the high-speed datapath (e.g., 
rather than the system bus). In operation 314, the coprocessor may reformat the 
received data to extract the original command and/or control data. In operation
316, the coprocessor may perform the operations indicated by the command
and/or control data and may generate display data in operation 318 to drive a
display or other I/O device. 

Although the individual operations of procedure 300 are illustrated and 
described as separate operations, one or more of the individual operations may be
performed concurrently and nothing requires that the operations be performed in
the order illustrated. 
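
The flow of operations 304 through 314 can be sketched as follows. The function names, the size threshold used for the operation 304 decision, the four-byte length prefix, and the use of deques to stand in for the datapath and the system bus are all hypothetical; this description leaves the decision criteria and the format open. Operations 316 and 318 are not modeled.

```python
from collections import deque

HIGH_SPEED_THRESHOLD = 64  # assumed: larger transfers bypass the system bus


def choose_path(data: bytes) -> str:
    """Operation 304: pick the system bus or the high-speed datapath."""
    return "datapath" if len(data) >= HIGH_SPEED_THRESHOLD else "system_bus"


def format_for_datapath(data: bytes) -> bytes:
    """Operation 306: prefix a 4-byte length so the receiver can unbundle."""
    return len(data).to_bytes(4, "big") + data


def reformat(stream: bytes) -> bytes:
    """Operation 314: extract the original command and/or control data."""
    length = int.from_bytes(stream[:4], "big")
    return stream[4:4 + length]


def send(data: bytes, datapath: deque, system_bus: deque) -> str:
    """Operations 304 through 310 on the application-processor side."""
    path = choose_path(data)
    if path == "datapath":
        buffered = bytearray(format_for_datapath(data))  # operation 308: buffer
        datapath.append(bytes(buffered))                  # operation 310: transfer
    else:
        system_bus.append(data)
    return path


def receive(datapath: deque) -> bytes:
    """Operations 312 and 314 on the coprocessor side."""
    return reformat(datapath.popleft())
```

A size threshold is only one possible criterion; the decision could equally depend on current system bus load or on the type of data being sent.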

It is emphasized that the Abstract is provided to comply with 37 C.F.R. 
Section 1.72(b) requiring an abstract that will allow the reader to ascertain the
nature and gist of the technical disclosure. It is submitted with the understanding
that it will not be used to limit or interpret the scope or meaning of the claims. 

In the foregoing detailed description, various features are occasionally 
grouped together in a single embodiment for the purpose of streamlining the 
disclosure. This method of disclosure is not to be interpreted as reflecting an
intention that the claimed embodiments of the subject matter require more features
than are expressly recited in each claim. Rather, as the following claims reflect,
inventive subject matter lies in less than all features of a single disclosed 
embodiment. Thus the following claims are hereby incorporated into the detailed 
description, with each claim standing on its own as a separate preferred
embodiment. 